59 research outputs found
Expert knowledge for computerized ECG interpretation
In this study, two main questions are addressed: (1) Can the time consuming and
cumbersome development and refinement of (heuristic) ECG classifiers be alleviated, and (2)
Is it possible to increase diagnostic performance of ECG computer programs by combining
knowledge from multiple sources?
Chapters 2 and 3 are of an introductory character. In Chapter 2, the measurement part of
MEANS is described and evaluated. This research largely depends on the earlier work of
Talman [11]. In Chapter 3, different methods of diagnostic ECG classification are described and
their pros and cons discussed. The issue is raised whether or not the ECG should be classified
using as much prior information as possible, and our position is made clear.
The first question, how to ease the transfer of cardiological knowledge into computer
algorithms, is addressed in Chapters 4 and 5. The development and refinement of heuristic ECG
classifiers is impeded by two problems: (1) It generally requires a computer expert to translate
the cardiologist's reasoning into computer language without the average cardiologist being able
to verify whether his diagnostic intentions were properly realized, and (2) The classifiers are
often so complex as to obscure what the classifier is doing when a particular case is
processed by the classification program. To circumvent these problems, we developed a
dedicated language, DTL (Decision Tree Language), and an interpreter and compiler for that
language. In Chapter
4, a comprehensive description of the DTL environment is given. In Chapter 5, the use of the
environment to optimize MEANS, following a procedure of stepwise refinement, is described.

The second question, whether it is feasible to combine knowledge from multiple sources in
order to increase diagnostic performance of an ECG computer program, is explored from several
perspectives in Chapters 6 through
QT dispersion as an attribute of T-loop morphology
BACKGROUND: The suggestion that increased QT dispersion (QTD) is due to
increased differences in local action potential durations within the
myocardium is wanting. An alternative explanation was sought by relating
QTD to vectorcardiographic T-loop morphology. METHODS AND RESULTS: The T
loop is characterized by its amplitude and width (defined as the spatial
angle between the mean vectors of the first and second halves of the
loop). We reasoned that small, wide ("pathological") T loops produce
larger QTD than large, narrow ("normal") loops. To quantify the
relationship between QTD and T-loop morphology, we used a program for
automated analysis of ECGs and a database of 1220 standard simultaneous
12-lead ECGs. For each ECG, QT durations, QTD, and T-loop parameters were
computed. T-loop amplitude and width were dichotomized, with 250 microV
(small versus large amplitudes) and 30 degrees (narrow versus wide loops)
taken as thresholds. Over all 1220 ECGs, QTDs were smallest for large,
narrow T loops (54.2+/-27.1 ms) and largest for small, wide loops
(69.5+/-33.5 ms; P<0.001). CONCLUSIONS: QTD is an attribute of T-loop
morphology, as expressed by T-loop amplitude and width.
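The loop-width definition above (the spatial angle between the mean vectors of the first and second halves of the loop) and the dichotomization thresholds can be sketched in a few lines. This is an illustrative reconstruction, not code from the actual analysis program; the function names and the plain-tuple representation of XYZ samples are our own choices:

```python
import math

def mean_vector(samples):
    """Component-wise mean of a sequence of 3-D (x, y, z) vectors."""
    n = len(samples)
    return tuple(sum(v[i] for v in samples) / n for i in range(3))

def spatial_angle_deg(u, v):
    """Angle in degrees between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # clamp rounding error
    return math.degrees(math.acos(cosine))

def t_loop_width(loop):
    """T-loop width: spatial angle between the mean vectors of the
    first and second halves of the loop (a sequence of XYZ samples)."""
    half = len(loop) // 2
    return spatial_angle_deg(mean_vector(loop[:half]), mean_vector(loop[half:]))

def classify_loop(amplitude_uv, width_deg, amp_threshold=250.0, width_threshold=30.0):
    """Dichotomize T-loop amplitude (microvolts) and width (degrees)
    using the 250-microV and 30-degree thresholds from the study."""
    size = "large" if amplitude_uv >= amp_threshold else "small"
    shape = "narrow" if width_deg <= width_threshold else "wide"
    return size, shape
```

A loop whose two halves point in perpendicular directions yields a width of 90 degrees and would fall into the "wide" category.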
Minimum bandwidth requirements for recording of pediatric electrocardiograms
BACKGROUND: Previous studies that determined the frequency content of the
pediatric ECG had their limitations: the study population was small or the
sampling frequency used by the recording system was low. Therefore,
current bandwidth recommendations for recording pediatric ECGs are not
well founded. We wanted to establish minimum bandwidth requirements using
a large set of pediatric ECGs recorded at a high sampling rate. METHODS
AND RESULTS: For 2169 children aged 1 day to 16 years, a 12-lead ECG was
recorded at a sampling rate of 1200 Hz. The averaged beats of each ECG
were passed through digital filters with different cut off points (50 to
300 Hz in 25-Hz steps). We measured the absolute errors in maximum QRS
amplitude for each simulated bandwidth and determined the percentage of
records with an error >25 microV. We found that in any lead, a bandwidth
of 250 Hz yields amplitude errors <25 microV in 95% of the children <1
year. For older children, a gradual decrease in ECG frequency content was
demonstrated. CONCLUSIONS: We recommend a minimum bandwidth of 250 Hz to
record pediatric ECGs. This bandwidth is considerably higher than the
previous recommendation of 150 Hz from the American Heart Association.
Training text chunkers on a silver standard corpus: Can silver replace gold?
Background: To train chunkers in recognizing noun phrases and verb phrases in biomedical text, an annotated corpus is required. The creation of gold standard corpora (GSCs), however, is expensive and time-consuming. GSCs therefore tend to be small and to focus on specific subdomains, which limits their usefulness. We investigated the use of a silver standard corpus (SSC) that is automatically generated by combining the outputs of multiple chunking systems. We explored two use scenarios: one in which chunkers are trained on an SSC in a new domain for which a GSC is not available, and one in which chunkers are trained on an available, albeit small, GSC supplemented with an SSC. Results: We have tested the two scenarios using three chunkers, Lingpipe, OpenNLP, and Yamcha, and two different corpora, GENIA and PennBioIE. For the first scenario, we showed that the systems trained for noun-phrase recognition on the SSC in one domain performed 2.7-3.1 percentage points
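A common way to build such a silver standard is per-token majority voting over the BIO chunk tags produced by the individual systems. The abstract does not spell out the exact combination rule, so the sketch below is one plausible, minimal version:

```python
from collections import Counter

def majority_vote(taggings):
    """Combine aligned per-token BIO tag sequences from several chunkers
    into one 'silver' sequence by simple majority vote per token
    (ties go to the tag encountered first)."""
    assert len({len(t) for t in taggings}) == 1, "tag sequences must be aligned"
    silver = []
    for token_tags in zip(*taggings):
        tag, _count = Counter(token_tags).most_common(1)[0]
        silver.append(tag)
    return silver
```

For three chunkers that tag four tokens as `B-NP I-NP O B-VP`, `B-NP O O B-VP`, and `B-NP I-NP B-NP B-VP`, the vote yields `B-NP I-NP O B-VP`.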
Consistency of systematic chemical identifiers within and between small-molecule databases
Background: Correctness of structures and associated metadata within public and commercial chemical databases greatly impacts drug discovery research activities such as quantitative structure-property relationship modelling and compound novelty checking. MOL files, SMILES notations, IUPAC names, and InChI strings are ubiquitous file formats and systematic identifiers for chemical structures. While interchangeable for many cheminformatics purposes, there have been no studies of the inconsistency of these structure identifiers arising from different approaches to data integration, including the use of different software and different rules for structure standardisation. We have investigated the consistency of systematic identifiers of small molecules within and between some of the commonly used chemical resources, with and without structure standardisation. Results: The consistency between systematic chemical identifiers and their corresponding MOL representation varies greatly between data sources (37.2%-98.5%). We observed the lowest overall consistency for MOL-IUPAC names. Disregarding stereochemistry increases the consistency (84.8% to 99.9%). A wide variation in consistency also exists between MOL representations of compounds linked via cross-references (25.8% to 93.7%); removing stereochemistry improves the consistency (47.6% to 95.6%). Conclusions: We have shown that considerable inconsistency exists in structural representations and systematic chemical identifiers within and between databases. This can have a great influence, especially when merging data and when systematic identifiers are used as a key index for structure integration or for cross-querying several databases. Regenerating systematic identifiers from the MOL representation, after applying well-defined and documented chemistry standardisation rules to all compounds, can dramatically increase internal consistency.
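The consistency figures above boil down to a simple ratio: the fraction of records whose stored identifier equals the identifier regenerated from the MOL block. A minimal sketch follows; the helper for dropping InChI stereo layers is a crude illustration of stereo-insensitive comparison, not a substitute for proper cheminformatics software, and the function names are our own:

```python
def consistency(records):
    """Fraction of records whose stored identifier equals the identifier
    regenerated from the MOL representation.
    records: iterable of (stored_id, regenerated_id) string pairs."""
    records = list(records)
    if not records:
        return 0.0
    matches = sum(1 for stored, regenerated in records if stored == regenerated)
    return matches / len(records)

def strip_stereo_layers(inchi):
    """Crude stereo-insensitive form of a standard InChI string: drop the
    /b, /t, /m, and /s stereo layers. Illustrative only."""
    layers = inchi.split("/")
    kept = [layer for layer in layers if layer[:1] not in ("b", "t", "m", "s")]
    return "/".join(kept)
```

Comparing `consistency(...)` before and after `strip_stereo_layers(...)` on both columns reproduces the with/without-stereochemistry contrast reported above.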
Discovering information from an integrated graph database
The information explosion in science has become a different problem: not the sheer amount per se, but the multiplicity and heterogeneity of massive sets of data sources. Relations mined from these heterogeneous sources, namely texts, database records, and ontologies, have been mapped to Resource Description Framework (RDF) triples in an integrated database. The subject and object resources are expressed as references to concepts in a biomedical ontology consisting of the Unified Medical Language System (UMLS), UniProt, and EntrezGene; the predicate resource refers to a predicate thesaurus. All RDF triples, including provenance, have been stored in a graph database. For evaluation we used an actual formal PRISMA literature study identifying 61 cerebrospinal fluid biomarkers and 200 blood biomarkers for migraine. These biomarker sets could be retrieved with weighted mean average precision values of 0.32 and 0.59, respectively, and can be used as a first reference for further refinements.
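Storing relations as provenance-annotated triples and querying them by pattern, as described above, can be sketched in a few lines. The identifiers below are hypothetical stand-ins, not real UMLS concepts or predicate-thesaurus entries:

```python
def match(store, s=None, p=None, o=None):
    """Return the triples in store matching an (s, p, o) pattern;
    None acts as a wildcard. Each stored item is (subject, predicate,
    object, provenance), so provenance rides along with every hit."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Hypothetical concept and predicate identifiers, for illustration only:
store = [
    ("umls:C0000001", "pred:is_biomarker_of", "umls:C0000009", "pmid:12345"),
    ("umls:C0000002", "pred:is_biomarker_of", "umls:C0000009", "db:record77"),
]
```

A query such as `match(store, p="pred:is_biomarker_of", o="umls:C0000009")` retrieves every candidate biomarker relation for a concept together with the source each relation was mined from.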
The role of explainability in creating trustworthy artificial intelligence for health care: A comprehensive survey of the terminology, design choices, and evaluation strategies
Artificial intelligence (AI) has huge potential to improve the health and well-being of people, but adoption in clinical practice is still limited. Lack of transparency is identified as one of the main barriers to implementation, as clinicians should be confident the AI system can be trusted. Explainable AI has the potential to overcome this issue and can be a step towards trustworthy AI. In this paper we review the recent literature to provide guidance to rese